RECEIVER WITH AUTOMATIC SKEW COMPENSATION 



Technical Field 

The present invention relates to the communication of signals, in particular, to 
the transmission and reception of digital signals. More specifically, the present
invention relates to both the static and dynamic compensation of skew in high speed 
communications channels or interfaces. 

The present invention is particularly applicable to interfaces between 
integrated circuits and for high speed communications which require dynamic skew 
compensation.

Background of the Invention 

One common form of communication system involves digital signals 
representing data which is sent over wires or other communication media, called a 
communication channel. Since the distances between a transmitter and a receiver 
may be relatively large, the digital signal carried via the communication channel may
pick up "glitches" or "noise". 

At present, various factors are known to limit the maximum data rate of a
digital receiver, among which are:

- timing uncertainty in the input signal;

- the phenomenon known as metastability within the receiving registers,
which in modern CMOS systems is in reality phase noise internal to the
registers;

- the noise in the channel, including the phase noise of the clock
synthesizer or recovery system;

- the required Bit Error level.

These problems have been addressed in the prior art by several approaches. 
One approach has been to use a digital data receiver including an analog 
filtering section that conditions an input signal. The analog filtering section removes 
noise and unwanted frequency components from the signal. In a conventional digital 
receiver, the filtering circuit has a fixed bandwidth that is set to accommodate the
anticipated baud rate of the incoming signal and to optimize the signal quality and the
quality of the received data. 

Signal quality is adversely affected by both intersymbol interference (ISI) and
adjacent channel interference (ACI). Analog filtering circuits are commonly applied to
reduce ISI, ACI, or other electronic noise associated with digital signal transmissions.
ISI is reduced when the filter bandwidth is widened and ACI is reduced when the
bandwidth is narrowed. Unfortunately, conventional fixed bandwidth filters inherently 
increase the amount of ISI when they are tuned to reduce ACI, and vice versa. As 
such, conventional analog filtering circuits in digital receivers are usually tuned to a 
less-than-optimum bandwidth with respect to ISI and ACI, which are often unknown a 
priori. 

The bandwidth accuracy of conventional tunable analog filters is only about 
10%. Although such accuracy may be sufficient to enable a digital receiver to gain 
symbol synchronization, the bandwidth inaccuracy may produce an unacceptable bit 
error rate (BER) resulting from excessive ISI or ACI. To minimize the BER in some
applications, it may be necessary to maintain bandwidth accuracy to within 5% or 
less. Unfortunately, conventional fixed bandwidth filters are not responsive to 
fluctuations in BER, ISI, or ACI. 

We will now consider in detail the effects of the different noise sources on the 
signal, when viewed over a short period of time, that is, without environmental 
changes. For clarity and ease of understanding, this field is described using 
elementary probability theory, which is a tool used widely in the engineering 
management of these problems. This theory is often taught pre-university, and 
expanded as a first year introductory topic for electronic engineering courses, and 
those versed in the field will be intimately familiar with this. 

Data errors in a channel with Gaussian distributed phase and amplitude noise
can be modelled as a noiseless ideal channel with all of the noise assigned to the clock
signal, which gives rise to the probability distribution of the sampling point shown
in Fig. 3. Symbols S0, S1 and S2 represent symbols at the input of the receiver, which
samples the data at a point in time that is symmetrically distributed around the
moment x according to a Gaussian distribution and described by the formula:



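$$p(t) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(t - x)^2}{2\sigma^2}\right)$$

where t is the sampling instant and σ is the width (RMS value) of the distribution.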

So here we have a channel with three consecutive symbols, S0, S1 and S2.
In Figure 3, the distribution in time of the sampling point for S1 is shown, but in
reality each symbol has a similar curve, so we can consider the data stream as a
series of symbols, each of which is sampled according to its own distribution. This is
shown clearly in Figure 5.

The bit error rate (BER) can be calculated as the probability of sampling the wrong
symbol. It is equal to the probability of sampling a channel symbol other than S1 (the
dashed area in Figure 3) multiplied by the probability that the sampled symbol has a
value different from that of S1, which for binary coding with equally distributed zeros
and ones is equal to 0.5. This can be described by the formula:

For the distribution shown in Fig. 3, the BER function is shown in Fig. 4.

The BER curve has a minimum in the middle of the bit interval, as shown in Figure
4 for one symbol. For a series of symbols, this BER curve becomes a periodic
function with a period equal to one bit interval. This is shown in Fig. 5.

The value at the minima depends on the distribution width σ. A graph of
the resulting function is shown in Fig. 6.
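The verbal description of the BER above can be sketched numerically. The following is a minimal illustration, not the formula of the specification (which is not reproduced in this text). It assumes that symbol S1 occupies the interval from 0 to the bit interval w, that the sampling instant is Gaussian around the nominal point x with width sigma, and that any sample falling outside S1 is wrong with probability 0.5. The function names and the example values (80 ps bit interval, 10 ps RMS noise) are chosen for illustration only.

```python
import math


def gaussian_cdf(t, mean, sigma):
    """Cumulative distribution of a Gaussian with the given mean and width."""
    return 0.5 * (1.0 + math.erf((t - mean) / (sigma * math.sqrt(2.0))))


def ber_at(x, bit_interval, sigma):
    """Assumed BER model: 0.5 times the probability of sampling outside S1.

    S1 is taken to occupy [0, bit_interval]; x is the nominal sampling point.
    """
    p_inside = gaussian_cdf(bit_interval, x, sigma) - gaussian_cdf(0.0, x, sigma)
    return 0.5 * (1.0 - p_inside)


if __name__ == "__main__":
    w, sigma = 80.0, 10.0  # picoseconds, illustrative values
    for x in (0.0, 10.0, 20.0, 40.0, 60.0, 70.0, 80.0):
        print(f"x = {x:5.1f} ps  BER = {ber_at(x, w, sigma):.3e}")
```

Under these assumptions the curve has its minimum at the middle of the bit interval and rises symmetrically towards both edges, which is the qualitative behaviour described for Fig. 4.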

The signal to noise ratio can be calculated in dB, for bit width w and RMS jitter,
according to the formula:

For a single flip-flop, the probability of capturing a logic transition (either from 0 to
1 or from 1 to 0) is a function of the time difference between the sampling point and
the point where the input signal crosses the threshold. This function can be
approximated as follows:

where P(x) is the probability of capturing the correct logic state,

x is the time difference between the moment when the input signal crosses the
threshold and the sampling point,

σ is the RMS value of the noise in the system, that is, the aggregate of the noise in the
channel, driver and receiver.

Fig. 7 is a diagram showing a plot of this probability function, with measurements
taken from an interface implemented using SSTL16857 registers as the solid line and the
theoretical function as the dotted line. In this case, the value of σ is 21 picoseconds,
obtained from observation of the measured signal with its noise. This distribution is
symmetrical, such that

P(x) = 1 - P(-x).
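As a hedged illustration of the capture probability, the sketch below assumes that P(x) is the integral of a zero-mean Gaussian of width σ, which is how the function is characterised later in this description. The exact formula of the specification is not reproduced in this text, so the function name and its form are assumptions. The value of 21 ps is the one quoted above for the SSTL16857 measurement.

```python
import math


def capture_probability(x, sigma):
    """Assumed model: P(x) as the cumulative Gaussian of width sigma.

    x is the time difference between the threshold crossing and the
    sampling point, in the same units as sigma.
    """
    return 0.5 * (1.0 + math.erf(x / (sigma * math.sqrt(2.0))))


if __name__ == "__main__":
    sigma = 21.0  # picoseconds, value quoted for Fig. 7
    for x in (-42.0, -21.0, 0.0, 21.0, 42.0):
        p = capture_probability(x, sigma)
        # The symmetry P(x) = 1 - P(-x) holds for every x in this model.
        print(f"x = {x:6.1f} ps  P = {p:.4f}  1 - P(-x) = {1.0 - capture_probability(-x, sigma):.4f}")
```

In this model the probability is exactly 0.5 when the sampling point coincides with the threshold crossing, and approaches 0 or 1 a few σ away from it.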

In addition to the noise distribution of the signal, we must consider the effect of
environmental changes, which cannot be covered by the same BER analysis,
because the time period needed to consider the environment is many orders of
magnitude longer than the time period involved in the consideration of phase and
channel noise.

In a communication channel, the integrity of the received data can be 
observed using an eye diagram, such as in Fig. 2. The eye in the very centre is the 
region where the data is stable and is strobed. The eye diagram shows time in the X
domain, in picoseconds in Fig. 2, and voltage or current in the Y domain, in mV in
Fig. 2. To receive data securely, it is necessary to sample the data (that is, close a
gate in the time domain), with the switching threshold of the gate as close as possible 
to the centre of the eye. A technique for tracking the centre of the eye in the voltage 
or current domain is described in US patent application 60/315,907. The present 
invention relates to how the eye is tracked in the time domain. 

The problem addressed by this innovation arises in very high speed systems,
where each signal can move in time due to changes in the environment, in addition to
the movement due to channel noise already considered. For example, if a
signal switches at 10 GHz, then the effect of someone putting their hand close to the
signal track may cause the signal to move in time by more than a clock period.
Similarly, if the signal is travelling down a cable and the cable is bent, then the signal
will take more or less time to arrive. Low frequency noise, vibration, temperature drift,
loading, power supply voltage changes, and other sources all have the effect of
skewing the signal. This means that the static picture represented by the eye
diagram is not representative of the dynamic environment. The environmental
change can be considered as a long term shift of the entire probability distribution of
the channel, that is, a shift of the series of distributions shown in Fig. 5. As this
distribution shifts, if the sampling point is fixed in absolute time then the errors
increase: the signal is no longer sampled at the minima of the BER curves, so the bit
errors increase as a function of the shift. Even small shifts can completely destroy
the ability of the channel to communicate any data at its maximum data rate.

Several techniques are known in the art to track and optimize the data sample
position. These include integrating the eye pattern transitions over a longer period of
time. Some clock sampling schemes use only an initial transition reference to prevent
tracking the clock sample position into a less advantageous portion of the eye
pattern.

According to US 6,111,911, a high degree of chip code synchronization is
used to clock the data bit decision. Transmitters transmit a data bit in synchronization
with the chip code pattern, therefore allowing chip position to be used as a cue to the
associated data bit position. Since the optimal position in which to sample a data bit
is known, that portion of the Bit Error Rate loss is eliminated. Empirical results from
this technique have shown practical improvements in the error rate versus
carrier-to-noise ratio in the minimal detectable signal case. This technique is applicable to any
direct sequence spread spectrum system in which a high degree of synchronization
is inherently achieved, provided that the data is transmitted in synchronization with
the chip code clock.

However, very often, in particular in high speed communications, such
synchronisation is not effective, while the Bit Error rate is defined by the current
application system requirements. The stricter these requirements are, the lower is
the data rate providing the desired Bit Error level.

A special case arises where a communication channel uses clock
recovery, that is, the clock is recovered from the signal and is then used to latch the
received data. This approach does, to a limited degree, reduce the effect of low
frequency noise, such as environmental changes. However, the problem with this
approach is that the entire error in the clock recovery system or the phase detectors
is added to the noise in the channel, and for very high frequency applications this
inaccuracy becomes a significant problem.

Object of the present invention. 

It is therefore a primary object of the present invention to provide an improved 
system for the communication of digital data in a noisy channel. 

It is another primary object of the present invention to compensate statically 
and dynamically for the skew caused by the channel noise, production tolerances 
and variations in channel length. 

It is another object of the invention to provide an improved, economical 
apparatus for transmitting and receiving data at high bit rates required for chip-to-chip
and high speed digital communications. 

It is yet another object of the invention to provide an improved, highly 
accurate and reliable reading of data at high speeds suitable for the processing of 
digital signals in communication systems.

It is a further object of the invention to provide an improved and highly 
compact receiving circuit with low timing uncertainty that can be economically 
implemented in a semiconductor integrated circuit. 

It is another object of the invention to provide an output interface for a digital 
receiver that provides the data flow through the receiver with a transmission rate of 
the signal at a low bit error level. 

It is a further object of the current invention that the channel reduces the 
production tolerances needed for its implementation by virtue of the system adapting 
to the environment in which it operates. 



It is a further object of the current invention to reduce the timing errors in the
clock recovery process in a serial communication link. 

These and other objects of the present invention are attained by a receiver 
employing a plurality of samplers coupled to a plurality of comparators, whereby the 
characteristics of the channel are used to compensate for skew within the channel by 
altering the timing characteristics of the signal. 

By comparator, we mean a logic function which produces an output
proportional to the similarity of one input to the other inputs, or its complement. The
comparators under consideration here output the number of inputs
which mismatch those that are in the state of the majority. The very simplest
comparator is a two input XOR (Exclusive OR) function, and for a three input
element, the logic function (E) is shown in Fig. 11.

A particular form of the invention is suitable for transmitting digital data at 
RapidIO, 3GIO, InfiniBand, Gigabit Ethernet and other high speed communications
standards. 

Summary of The Invention 

The present invention relates to a device and method employing the switching 
characteristics within the receiving registers to determine the characteristics of the 
channel and to compensate for skew within the channel by altering the timing 
characteristics of the signal. The present invention involves various applications of 
the same innovation: the reduction of timing error by combining a plurality of registers 
to produce a composite register with a reduced level of internal noise. 

In its most basic form, the invention applies a plurality of registers in such a 
way that their probability distributions are combined, such that the overall distribution 
is narrower than the distribution of any one of the registers acting alone. A register in 
this context is generally, but not necessarily, a data sampler, and may have only 
transitive register characteristics such as a dynamic flip flop or storage gate. 

The invention comprises a series of registers which sample the data, each
register slightly offset in time, for example, with a variable delay between registers,
such as in Fig. 8, or static delays as in Fig. 9. In the very simplest embodiment, there
need be no distinct delay element, because when a set of registers is triggered at the
same instant in time, their internal phase noise will cause them to latch at different
points in time, as a function of the distribution which is shown in Fig. 3.

In a more refined embodiment, the present invention spaces the plurality of 
registers in time using delay elements, or wire with inherent delay, and then applies 
the outputs of these registers to a logic network to determine which register has the
lowest bit error rate. This set of delay elements can be implemented using a
polyphase clock generator to equalise the space between registers. 

Thus, in one aspect of the invention, a receiver is provided, comprising a 
plurality of samplers for sampling data, coupled with a set of delay devices for 
providing a series of signal copies with each copy being shifted by a predetermined 
time interval, at least one means for comparing signals latched by said samplers, a 
means, such as a multiplexer, for choosing a signal copy with minimal BER, and a
means, such as a state machine, for determining the number of the signal copy with
minimal BER, and optionally, a pipeline for latency adjustment. 

In another aspect of the invention, a receiver comprises a plurality of samplers 
for sampling data, providing a series of simultaneous signal copies, at least one 
means for comparing signals latched by said samplers, a means for choosing a 
signal copy with minimal BER, a means for determining the number of the signal 
copy with minimal BER, and optionally, a pipeline for latency adjustment. 

In still one more aspect of the invention, a receiver comprises at least one
sampler for sampling data coupled with a set of delays, or a variable delay, providing
a series of signal copies spaced in time, at least one means for comparing signal
copies, a means for selecting a signal copy with minimal BER, a means for
determining the delay corresponding to this copy, and a means for applying the
obtained delay to other samplers when sampling data.

The proposed receiver provides the high speed transmission of data, wherein 
the data transmitted are latched at the moment when the signal has the maximal 
stability. 

Preferably, the samplers are implemented as registers, flip-flops, latches,
track-and-hold, sample-hold devices, etc.

Preferably, comparators are implemented as XORs as in Fig. 10, or as
majority elements, or using circuitry such as that shown in Figure 11 to create an
error output (E) which is the number of bits which differ from the majority of the input
bits, shown in Figure 11 for three inputs.

In another aspect, a method of high speed communication is provided 
employing the characteristic of metastability, that is, phase noise internal to the
register, within the receiving registers to measure the characteristics of the channel 
and to compensate for production tolerances within the channel by altering the timing 
characteristics of the signal. 

In still another aspect, a communication channel employing a receiver of the 
present invention is provided. 

Brief Description of The Drawings 

For a better understanding of the present invention and the advantages 
thereof and to show how the same may be carried into effect, reference will now be 
made, by way of example and without loss of generality, to the accompanying drawings in
which: 

Fig.1 shows a block diagram of an extended embodiment of the present 
invention to form a receiver; 

Fig. 2 shows an eye diagram for a channel running at 12.5Gbps with an eye 
opening amplitude of 20mV and 55ps. 

Fig.3 shows the sampling point distribution for a bit S1 in a serial data stream. 

Fig. 4 shows a Bit Error Rate Distribution in accord with position in time inside 
the bit frame of the actual sample point. 

Fig. 5 shows the series of Bit Error Rate Distributions for a serial data stream. 

Fig. 6 shows the level of the Bit Error Rate where the sampling point is on the
minima of the Bit Error Rate Distribution, as a function of the ratio of bit interval to
RMS channel noise.

Fig. 7 shows the theoretical (dotted) and experimental (solid) probability of capturing
a logic state moving from 0 to 1 as a function of the time difference between the
sampling point and the point where the input signal crosses the threshold, in the case
where the sampler was implemented using an SSTL16857 register.

Fig. 8 shows a block diagram of an embodiment of the present invention using
variable delays between the samplers.

Fig. 9 shows a diagram of sampler 2 shown in the Block Diagram in Fig. 1. 

Fig. 10 shows a transition detector according to one of the example 
embodiments of the invention. 

Fig. 11 shows a three input logic block to create an error output (E) which is
the number of bits which differ from the majority of the input bits, and Q, which is the 
majority element output, E being the orthogonal function of Q. 

Fig. 12 shows the family of functions of the Bit Error Rate Distribution for a 
series of bits, on the output of the majority element, for different widths of the majority 
element, where all noise is external to the sampler. 

Fig. 13 shows the family of functions of the Bit Error Rate Distribution for a 
series of bits, on the output of the majority element, for different widths of the majority 
element, where all noise is internal to the sampler. The importance of this can be 
understood more clearly from Fig. 14, which shows the same curves with a linear 
scale, rather than a log scale and with a scaling. 

Fig. 15 shows the BER against the number of samplers per bit, equally 
distributed across the bit interval, as a function of the ratio of the bit interval to the 
RMS noise. 

Fig. 16 shows the family of curves for probability of the output transition 
sensor output being a 1 where 16 clock phases are used to control the time interval 
between samplers. Each curve in this figure is for a particular ratio of bit interval to 
RMS noise. 

Fig. 17 shows the effective baud rate for a channel according to the current
invention as a function of the size of the packet (for each curve, the packet size is in
bits), for an example channel with 10 ps RMS noise.

Figure 18 shows the same information as Figure 17, but with 64 bits of
protocol overhead deducted from each packet, to give a family of curves showing
actual data rates excluding the protocol, under the same conditions of 10 ps RMS
noise.

Detailed Description Of The Invention

The invention will now be described in detail without limitation to the generality 
of the present invention with the aid of example embodiments and accompanying 
drawings. 

The very simplest embodiment of the present invention comprises several 
samplers used in parallel with majority logic at the output. This will have the effect of 
combining the BER probability distribution, such that if the samplers are of a similar 
type, then the resulting BER distribution is narrower than for any of the individual 
samplers. The sampler in this instance would normally be a flip flop, a simple type of 
register. The logic to combine these registers is shown in Fig. 11 for three flip flops.
The advantage of increasing the number of flip flops is illustrated later in a
more sophisticated embodiment, but the same principle applies for all embodiments
of the present invention. 

A second embodiment of the present invention uses the same principle to 
implement a single bit self-calibrating receiver as provided in Fig. 8, with 3 monotonic 
delay verniers 61, 62 and 63, a transition detector 66, two samplers with pipeline 
adjusters 67 and 68, controller 69 and output multiplexer 70. 

The controller in this case can be a comparatively simple state machine which 
continuously scans the vernier at the input of the transition detector and measures 
and stores values corresponding to the minimums of that function. The preferable 
range of these verniers should be not less than two channel symbol intervals to allow 
more than one local minimum. Scanning need only be provided at a low frequency, 
such as 20 kHz, allowing easy filtering of the received data from the transition
detector signal. 

At the end of each cycle of scanning the vernier at the input of the transition
detector, the co-ordinate of the value closest to the middle minimum is loaded into
one of the verniers at the input of a sampler. Both samplers work in turn. When
scanning is finished and a new value of the position of the minimum is determined, the
spare vernier is set to the corresponding position and then the output multiplexer
switches to that channel. If the new position of the minimum belongs to a different
bit, an appropriate pipeline adjustment must be provided. The depth of the pipeline
adjusters should be enough to cover all possible skew values. The initial position
after power up or reset should be in the middle.
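The scan-and-select behaviour of the controller can be sketched in software. This is only an illustrative model under stated assumptions, not the state machine of the specification: the hypothetical list `transition_counts` holds one transition-detector count per vernier step gathered during a scan, local minima are found by simple neighbour comparison, and the minimum closest to the middle of the vernier range is selected.

```python
def find_local_minima(counts):
    """Return indices whose count is below the left neighbour and not above the right.

    On a flat-bottomed minimum this reports the left edge of the plateau.
    """
    minima = []
    for i in range(1, len(counts) - 1):
        if counts[i] < counts[i - 1] and counts[i] <= counts[i + 1]:
            minima.append(i)
    return minima


def choose_vernier_setting(counts):
    """Pick the local minimum closest to the middle of the scanned range.

    Falls back to the mid-range setting when no minimum is found, mirroring
    the initial position after power up or reset.
    """
    minima = find_local_minima(counts)
    if not minima:
        return len(counts) // 2
    middle = (len(counts) - 1) / 2
    return min(minima, key=lambda i: abs(i - middle))


if __name__ == "__main__":
    # Hypothetical scan over slightly more than two symbol intervals.
    transition_counts = [9, 4, 1, 5, 9, 5, 0, 4, 9, 6, 2, 5, 9]
    print(find_local_minima(transition_counts))
    print(choose_vernier_setting(transition_counts))
```

A hardware controller would additionally need to filter the counts over many cycles and decide when the selected minimum has moved into a neighbouring bit, so that the pipeline adjuster can be changed accordingly.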

Continuous monitoring of the input allows timing uncertainty to be
compensated at the input, including uncertainty due to drift or low frequency noise
caused by environmental variations.

The sampler can be implemented in different ways. The simplest is a single
flip-flop, but to increase performance or reduce the Bit Error Rate, several flip-flops
can be used in parallel with majority logic at the output, which will be equal to one if
more than half of the inputs are equal to one. An odd number of flip-flops shall be
used, with a total quantity of 2n+1. The resulting Bit Error Function is described as:

$$\mathrm{BER}_n(x) = \sum_{k=n+1}^{2n+1} C_{2n+1}^{k} \times \left[ P^{k}(x) \times \left(1 - P(x)\right)^{2n+1-k} \right]$$

Plots of the different resulting Bit Error functions are provided in Figures 13
and 14. The choice of the number of samplers is determined from the BER curves,
in particular a plot such as that shown in Figure 15, where BER is plotted against the
number of samplers for various amounts of noise: each curve in Fig. 15 is for a
particular ratio of bit interval to RMS noise. This shows that 16 samplers are sufficient
to operate with a bit interval to RMS noise ratio of 8, such as a channel with 10 ps
RMS jitter and an 80 ps bit interval. Reducing this number to fewer than 16 samplers
will, according to the curves in Fig. 15, increase the bit error rate of the channel.
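The binomial sum for a majority of 2n+1 flip-flops can be evaluated directly. In the sketch below the argument `p` is interpreted as the probability that a single flip-flop latches the wrong value at a given sampling position, and independence between flip-flops is assumed. Both are assumptions made for this illustration, and the function name is hypothetical.

```python
from math import comb


def majority_ber(p, n):
    """Probability that a majority of 2n+1 independent samplers is wrong.

    p is the assumed single-sampler error probability; the majority output
    is wrong when at least n+1 of the 2n+1 samplers are wrong.
    """
    total = 2 * n + 1
    return sum(
        comb(total, k) * p**k * (1.0 - p) ** (total - k)
        for k in range(n + 1, total + 1)
    )


if __name__ == "__main__":
    for n in (0, 1, 2, 3):
        print(f"{2 * n + 1} flip-flops: BER = {majority_ber(0.1, n):.6f}")
```

For a single-sampler error probability below 0.5 the result falls as n grows, which corresponds to the narrowing of the combined distribution described for the basic embodiment. At exactly 0.5 the majority vote gives no improvement.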

To enable the raw Bit Error Rate from the channel implemented according to
the present invention to be used effectively without data errors, error correcting
codes such as Viterbi or block codes should be used, with either error correction
or retransmission of the data in the event of a bit error. The channel payload curves,
such as those shown in Figures 17 and 18, are used to determine the useful data capacity
of the channel incorporating these error detection or correction techniques.
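The trade-off behind payload curves of this kind can be illustrated with a simple model. The specification does not state how its curves were computed, so everything below is an assumption: bit errors are taken to be independent, a packet containing any bit error is discarded and retransmitted, and a fixed number of protocol overhead bits per packet carries no payload. The function name and parameters are hypothetical.

```python
def effective_rate(bit_rate, ber, packet_bits, overhead_bits=0):
    """Assumed goodput model for a channel with retransmission.

    bit_rate      raw channel rate in bits per second
    ber           raw bit error rate, errors assumed independent
    packet_bits   total packet length in bits
    overhead_bits protocol bits per packet that carry no payload
    """
    packet_ok = (1.0 - ber) ** packet_bits  # probability of an error-free packet
    payload_fraction = (packet_bits - overhead_bits) / packet_bits
    return bit_rate * packet_ok * payload_fraction


if __name__ == "__main__":
    for size in (128, 512, 2048, 8192):
        rate = effective_rate(12.5e9, 1e-5, size, overhead_bits=64)
        print(f"packet of {size:5d} bits: {rate / 1e9:.3f} Gbit/s")
```

In this model short packets are dominated by the overhead and long packets by the probability of retransmission, so the useful rate peaks at an intermediate packet size for a given raw BER.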

A plurality of the units described can be used for implementing a wide parallel bus.
In this case, after power up an extra procedure is used for correcting the depth of the
pipeline adjusters on the different bits to achieve the same latency. There are many
ways to align bits, such as those described in standard protocols like InfiniBand. A simple
solution is to use an all zeroes to all ones pattern, but for complex skew adjustment,
such as the pattern dependent adjustment described in other patents by the same
inventors, the gating function of the present invention may be used to select
individual bits in a data stream.

For better stability, coding is preferably used to limit the space between 
changes of state or toggles. An appropriate means to do this is using 8b/10b 
encoding, which is widely used in the industry to achieve a DC balanced code, with a 
limited frequency bandwidth by enforcing changes in data polarity using encoding 
techniques. 

In Fig. 1, a block diagram of a third and improved embodiment of a receiver
according to the invention is shown. Preferably, the receiver comprises samplers 2,
majority elements and transition detectors 3, 4, 5, data selector 6, controller 7 and a
pipeline latency adjustment element 8 which operates as a FIFO.

Preferably, samplers 2 are implemented as a set of registers for latching data,
illustrated in more detail in Fig. 9. As shown in Fig. 9, registers 31, 32, 33, 34 are
coupled with a set of delay devices 35, 36, 37 for providing a series of signal copies
with each copy being shifted by a predetermined time interval. These registers
provide a signal at different points of time, according to the continuous BER function
shown in Fig. 5.

Samplers can also be implemented in other ways. The simplest is a single
flip-flop, but to increase performance or reduce the Bit Error Rate, several flip-flops may
be used in parallel with majority logic at the output according to the most basic
embodiment of the current invention. That is, the invention can be applied in a
nested manner.

The outputs of samplers 2 are connected to the inputs of majority elements
3, 4, 5, where the output of each majority element is equal to "1" if more than
half of the inputs are equal to "1", and "0" if more than half of the inputs are equal to
"0". An odd number of samplers shall be used in conjunction with each majority
element, with a total quantity of 2n+1.

A receiver as shown in Fig. 1 according to the present invention comprises a
set of logic elements 3, 4, 5, for providing a value Q corresponding to the value at the
majority of its inputs (D0, D1, D2) and a number E of inputs having a value different
from the value at the majority of inputs.

A detailed example of these logic elements for k=3 inputs is shown in Fig. 11, and it
is a simple matter to expand this to cover any number of inputs using the majority
function. The techniques for expanding logic functions are widely disseminated. For
an even number of inputs the function is simply an XOR. The logic function is such that when
all inputs are zero or all inputs are one, the output is zero. When only one input is
zero or only one input is one, then the output is 1. When only two inputs are one or only
two inputs are zero and the number of inputs is more than 3, then the output is 2, and
so on. This logic can be synthesized by standard tools, such as those from Synopsys
and other EDA vendors, or can be derived by hand without difficulty.

The logic element in Fig. 11 consists of three AND elements 41, 42, 43
coupled to an OR element 47, which gives a value Q corresponding to the value at
the majority of the inputs of AND elements 41, 42, 43, and a NAND element 44 and an OR
element 45 coupled to an AND element 46, which gives the number E of inputs
having a value different from the value at the majority of inputs.
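The three input element and its generalisation can be modelled in a few lines. The gate-level function below follows the element numbering given above, but the pairing of inputs on the AND elements 41, 42, 43 is an assumption (each AND is taken to combine a different pair of inputs), as is the choice of a three input NAND and a three input OR for elements 44 and 45. The general function simply counts inputs and assumes an odd number of them. Both function names are hypothetical.

```python
from itertools import product


def majority_and_error_3(d0, d1, d2):
    """Gate-level model of a three input majority/error element.

    Returns (Q, E): Q is the majority value, E is 1 when exactly one
    input disagrees with the majority and 0 when all inputs agree.
    """
    and41 = d0 & d1
    and42 = d1 & d2
    and43 = d0 & d2
    q = and41 | and42 | and43      # OR element 47
    nand44 = 1 - (d0 & d1 & d2)    # 0 only when all inputs are one
    or45 = d0 | d1 | d2            # 0 only when all inputs are zero
    e = nand44 & or45              # AND element 46
    return q, e


def majority_and_error(bits):
    """General model for an odd number of inputs: majority value and mismatch count."""
    ones = sum(bits)
    zeros = len(bits) - ones
    return (1 if ones > zeros else 0), min(ones, zeros)


if __name__ == "__main__":
    for bits in product((0, 1), repeat=3):
        print(bits, majority_and_error_3(*bits), majority_and_error(bits))
```

Running the loop shows that the gate-level model and the counting model agree on all eight input combinations, so the counting form can stand in for wider elements where E may exceed 1.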

The receiver in Fig. 1 further comprises a data selector or multiplexer 6 for
choosing a copy of the signal with minimal BER, a state machine 7 for determining the
number of the copy with minimal BER, and a pipeline 8 for latency adjustment.

According to the invention, for better performance of the communication
channel, the bit interval is covered by several samplers spaced in time, wherein the
sampler that is closest to the minimum of the BER function is preferably chosen as the
sampler used for receiving data.

A particularly useful method of spreading samplers in time is to use a
polyphase clock. Clock trees can generate a polyphase clock by virtue of their delay,
or the clock can be implemented using a ring oscillator with each clock phase being
taken from each inverter stage of the oscillator. Some extra phase splitters can be
used for finer granularity. With the polyphase clock, the sampling points of the
registers are spread in time by virtue of their being clocked at slightly different
instants in time.

Another useful aspect of the present invention is that the outputs from the
samplers themselves indicate, over a number of cycles, the DC bias in the signal.
This information can be applied using the invention described in US patent
application 60/315,907 to track the voltage or current threshold within the eye
diagram.

The application of the sampler outputs to achieve this purpose should be 
apparent to someone skilled in the art of signal processing, but in summary, when 
the bit stream is encoded with a DC balanced code such as phase modulated codes, 
8b/10b encoding, or 16b/20b encoding, then the value of each of the samplers 
should be 50 percent 1s and 50 percent 0s. If the average number of 1s is more than 
50%, then the threshold is too low and should be increased, such as by lowering the 
terminating voltage or controlling the reference in a differential stage. If the average 
number of 1s is less than 50%, then the threshold is too high, and the reference 
voltage should be lowered. Similar compensation can be implemented with current 
mode systems. Using just one register and averaging over a number of cycles gives a 
loop response which can be longer than the period of the noise, particularly in real 
systems where the noise can be caused by other logic, such as power supply noise. 
Modern low voltage DC to DC converters already operate at frequencies of around 
10 MHz, so rapid adjustment of the threshold is needed. The present invention gives 
the input data from each single clock to perform this adjustment: if the samplers are 
spread in time, then their outputs will be distributed by a function that can be 
approximated as the integral of a Gaussian function for each data transition, that is, 
a symmetrical function around the threshold, such as the 0.5 level in Fig. 7. Any 
tendency for the threshold in the eye diagram to move is seen immediately in the 
imbalance in the distribution of these samples, allowing the eye of the eye diagram to 
be tracked in the Y domain on a cycle by cycle basis, in parallel with the normal 
operation of the channel. 
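
The threshold rule described above can be illustrated with a minimal sketch. It 
is not the circuit of the invention: the function names, the list-of-lists data 
layout and the deadband value are assumptions introduced only for illustration. 

```python
# Illustrative sketch only: names, data layout and deadband are assumptions.

def ones_fraction(samples):
    """Fraction of 1s over all sampler outputs.

    samples: list of clock cycles, each a list of 0/1 sampler outputs.
    """
    total = sum(len(cycle) for cycle in samples)
    ones = sum(sum(cycle) for cycle in samples)
    return ones / total


def threshold_adjustment(samples, deadband=0.02):
    """Return +1 to raise the threshold, -1 to lower it, 0 to hold.

    For a DC balanced code the expected fraction of 1s is 0.5.
    More 1s than that means the threshold is too low (raise it);
    fewer 1s means the threshold is too high (lower it).
    The deadband is a hypothetical parameter that avoids hunting.
    """
    frac = ones_fraction(samples)
    if frac > 0.5 + deadband:
        return +1
    if frac < 0.5 - deadband:
        return -1
    return 0


if __name__ == "__main__":
    # Two cycles of four time-spread samplers, biased towards 1s.
    print(threshold_adjustment([[1, 1, 1, 0], [1, 1, 0, 1]]))
```

Because every cycle contributes several time-spread samples, such a rule can 
react within a few clock cycles instead of averaging one register over many. 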

Operation 

The operation of the present invention in its most basic form can be easily 
understood by a specialist in the art, and can be aided using tools such as MathCAD. 
The operation of the more complicated embodiments can be understood by 
considering the function of the receiver shown in Fig. 1. The operation of this receiver 
will now be described, without loss of generality. 

To identify the position in time at which the BER function is minimal, several 
approaches can be used. By spreading the samplers in time, information on which 
direction the signal is moving in time can be determined, and this information can be 
used by the controller to introduce pipeline delays and to track the eye of the eye 
diagram over multiple clock cycles. It is not essential to have these samplers spread 
in time by more than one bit period, or even a bit period. 

If the sampler with the lowest bit error rate moves to the upper boundary, then 
the selection shall wrap to the first sampler so that it continues to follow the sampler 
with the minimum bit error rate. It is then required to capture two bits in one cycle, 
one from the first sampler and one from the last sampler, and to take data from the 
first sampler in the subsequent clock cycles. 

If the sampler with the minimum bit error rate moves to the lower boundary, 
the opposite is performed, with one sample being dropped by jumping from the first 
sampler to the last sampler on two sequential clock cycles. 
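
The two boundary cases can be summarised in a short sketch. The function name 
and the zero-based sampler indexing are assumptions made for illustration, and 
at least three samplers are assumed so that the two wrap directions are distinct. 

```python
# Illustrative sketch only: name and indexing are assumptions.

def symbols_this_cycle(prev_idx, new_idx, n_samplers):
    """Number of symbols to capture when the selected sampler changes.

    prev_idx, new_idx: zero-based index of the selected sampler in the
    previous and the current clock cycle. n_samplers must be >= 3.
    """
    if n_samplers < 3:
        raise ValueError("sketch assumes at least three samplers")
    last = n_samplers - 1
    if prev_idx == last and new_idx == 0:
        # Wrapped past the upper boundary: take one bit from the last
        # sampler and one from the first sampler in the same cycle.
        return 2
    if prev_idx == 0 and new_idx == last:
        # Wrapped past the lower boundary: one sample is dropped.
        return 0
    return 1


if __name__ == "__main__":
    print(symbols_this_cycle(3, 0, 4))
```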

However, if the time delay between the samplers is not well defined, then 
extra samplers can be added to provide an overlap between subsequent bit intervals. 

One approach according to the invention is to use several samplers per input 
line with a difference in delays from the input to the sampler. These delay elements 
can be implemented in a data path, in a clock signal path, or in both paths. 

According to the example embodiment shown in Fig. 9, each flip-flop 31, 32, 
33, 34 takes independent samples of its input at different moments of time, together 
covering an interval wider than one bit symbol interval. 

Each flip-flop can be defined by a function P(x + Xn), as shown in Figure 3, 
where Xn is the difference in sampling points between the first sampler and sampler n 
as: 

Each group of k subsequent sampler outputs is passed to one of the logic elements 
3, 4, 5. The E output of each logic element is passed to a state machine 7 which 
determines the logic element with minimal error level. The number of this element is 
passed to output 
multiplexer 6, which passes data signal Q from that element to the output. The state 
machine 7 counts 1s from each of the logic elements, 3, 4, 5 etc, in a certain period 
of time. It then compares the counts to find the channel which generates the lowest 
number. This channel number is coded and passed to the data selector 6, so data is 
selected from that sampler and passed to the output pipeline adjuster, used as a 
FIFO 8. This FIFO can, in a preferred embodiment, pick up none, one or two symbols 
in a cycle, to allow the samplers to be wrapped as already explained for when the 
sampler with the lowest BER is moved across the bit frame boundaries. 
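
The selection performed by the state machine can be sketched as follows. The 
sketch assumes that each logic element is a majority element whose E output is 1 
whenever its k inputs are not unanimous. That definition of E, and all function 
names, are assumptions made for illustration only. 

```python
# Illustrative sketch only: the meaning of E and all names are assumptions.

def majority_element(bits):
    """Return (Q, E) for an odd number of 0/1 sampler outputs.

    Q is the majority value. E is assumed to flag disagreement:
    0 when all inputs agree, 1 otherwise.
    """
    ones = sum(bits)
    q = 1 if 2 * ones > len(bits) else 0
    e = 0 if ones in (0, len(bits)) else 1
    return q, e


def select_element(e_history):
    """Pick the logic element with the fewest E flags in the window.

    e_history: list of clock cycles, each a list with one E flag per
    logic element. Ties resolve to the lowest element index.
    """
    counts = [sum(column) for column in zip(*e_history)]
    return counts.index(min(counts))


if __name__ == "__main__":
    window = [[1, 0, 1], [1, 0, 0], [0, 0, 1]]
    print(select_element(window))
```

The returned index plays the role of the channel number passed to the data 
selector 6. 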

The state machine 7 also functions to adjust the pipeline depth at the output of 
the receiver when the newly selected majority element is one bit interval away from 
the previously used element. Continuous monitoring of the state machine inputs 
thereby compensates for timing uncertainty at the receiver's input and for its drift or 
low frequency noise due to environmental variations. 

A single bit channel of the receiver according to the invention with k = 3 is 
shown in Fig. 5. A plurality of receivers can be used for parallel busses. In this case, 
the initial pipeline values shall be updated during the initialization procedure to 
provide the same latency on each bit. 

The congregate sampler noise has two parts. The fraction of this noise which is 
caused by the sampler itself is independent from sampler to sampler, while the noise 
created by the clock generator, signal transmitter and channel media is applied to all 
samplers simultaneously. 

To analyze the technical effect achieved by using majority elements, both 
extreme cases, in which the fraction of the sampler inherent noise is 100% and 0%, 
shall be considered. 

When the sampler inherent noise is 100%, the BER value at the output of a 
majority element depends significantly on the number of samplers used for that 
element, as shown in Fig. 6 for k=1,3,5. In this figure, the upper curve is obtained 
using one sampler per majority element, the middle curve using 3 samplers per 
majority element, and the lower one using 5 samplers. 

When the sampler inherent noise is negligible, the number of samplers used 
for the majority function does not make any significant change in the resulting BER, 
as seen in Fig. 7. 
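
The first extreme case can be illustrated numerically. The sketch below uses the 
standard binomial expression for a majority vote over k samplers that err 
independently with probability p. The function name and the example value of p 
are assumptions, and the numbers are not taken from the figures. 

```python
# Illustrative sketch only: name and example value are assumptions.
from math import comb


def majority_ber(p, k):
    """Error probability of a majority vote over k independent samplers.

    p: error probability of a single sampler (0 <= p <= 1).
    k: odd number of samplers feeding the majority element.
    The vote is wrong when more than half of the samplers are wrong.
    """
    need = k // 2 + 1
    return sum(comb(k, i) * p**i * (1 - p) ** (k - i)
               for i in range(need, k + 1))


if __name__ == "__main__":
    for k in (1, 3, 5):
        print(k, majority_ber(0.1, k))
```

In the opposite extreme, where all noise is common to the samplers, every sampler 
errs on the same bits, so the majority vote gives the same error rate as a single 
sampler regardless of k. 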

The averaged and normalized E output of a majority element also does not 
significantly depend on the number of majority element inputs, as shown in Fig. 8. 

From the expectation that the largest portion of noise belongs to the driver, 
channel media and clock generator, it is clear that it is preferable to use the minimum 
number of inputs at the majority elements, which is 3. 

The resulting BER value is different for different numbers of samplers equally 
distributed across the bit interval and for different ratios between the bit interval and 
the RMS noise value. These functions are presented in Fig. 9, where the number of 
samplers is on the horizontal axis and the ratio between the bit interval and σ is an 
index of the BER 
function. It is clear from this picture that the optimal number of samplers per bit is 
close to 16. 

A simplified alternative arrangement is shown in Fig. 8. According to this 
embodiment, a single bit receiver contains three monotonic delay verniers 61, 62, 63, 
a transition detector 66, two samplers 64, 65 with pipeline adjusters 67, 68, a 
controller 69 and an output multiplexer 70. 

The feedback loop or detector 66 is used to control the best sampling point 
position. For example, this detector can be implemented as shown in Fig. 11. Two 
independent flip-flops 11, 12 sample their inputs simultaneously. Each flip-flop 
is defined by the P(x) function described above. 

The state machine 69 continuously scans the vernier 63 at the input of the 
transition detector 66 and measures and keeps values corresponding to the 
minimums of that function. The range of these verniers should preferably be not less 
than two channel symbol intervals, to allow for more than one local minimum. 
Scanning need only be provided at a low frequency, such as 20 kHz, allowing easy 
filtering of the received data from the transition detector signal. At the end of each 
cycle of scanning the vernier at the input of the transition detector, the co-ordinate of 
the value closest to the middle minimum is loaded into one of the verniers at the input 
of a sampler. Both samplers 64, 65 work consecutively. When scanning is finished and 
a new value of the position of the minimum is determined, the spare vernier is placed 
onto the corresponding position and then the output multiplexer 70 switches to that 
channel. If the new position of the minimum belongs to a different bit, an appropriate 
pipeline adjustment must be provided. The depth of the pipeline adjusters 67, 68 
should be enough to cover all possible skew values. The initial position after power up 
or reset should be in the middle. 
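
One possible reading of the scanning step is sketched below. It assumes that the 
scan yields one measured value per vernier setting and that the setting to load is 
the local minimum nearest the middle of the scan range. The function names, the 
data layout and the fallback to the middle setting are assumptions made for 
illustration. 

```python
# Illustrative sketch only: names, data layout and fallback are assumptions.

def local_minima(values):
    """Indices of interior local minima of a scanned function."""
    return [i for i in range(1, len(values) - 1)
            if values[i] < values[i - 1] and values[i] <= values[i + 1]]


def best_vernier_setting(values):
    """Vernier setting to load into the spare sampler vernier.

    values: measured transition detector output, one per vernier
    setting, covering at least two symbol intervals.
    Returns the local minimum closest to the middle of the range, or
    the middle setting itself when no local minimum was found.
    """
    middle = (len(values) - 1) / 2
    minima = local_minima(values)
    if not minima:
        return len(values) // 2
    return min(minima, key=lambda i: abs(i - middle))


if __name__ == "__main__":
    scan = [5, 1, 4, 6, 2, 5, 7, 0, 3]
    print(local_minima(scan), best_vernier_setting(scan))
```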

Continuous monitoring of the input allows timing uncertainty to be 
compensated at the input, including uncertainty due to drift or low frequency noise 
due to environmental variations. 

Thus, the present invention provides improvements to the Bit Error rate versus 
channel and inherent register noise. This improvement is a result of intelligent 
arrangement of circuit elements and employment of the characteristic of 
metastability (by which we mean the probability distribution of the transition phase 
noise internal to a register) within the receiving registers to measure the 
characteristics of the channel and to compensate for production tolerances within the 
channel by altering the timing characteristics of the signal. 

The advantage of the present invention is that the data bit is sampled at the 
optimal position and, thereby, it is possible, for a given Bit Error rate, to provide a 
system having a minimal bit interval, in which the data rate may be increased up to a 
few σ per bit, such as 4σ, where σ is the RMS value of the noise in the system, that is, 
the congregate noise in the channel, driver and receiver. 

In another embodiment, the samplers and their associated logic can be 
pipelined, such as in FIFOs or by a datapath. 

At its most basic level, the present invention samples the data and then 
subsequently, the logic determines what was the best time to have sampled that 
data, with full hindsight. This is a fundamental aspect of the sophisticated 
embodiments of the present invention. This is quite contrary to contemporary 
methods, which require the connection of some extra detectors on the channel, or 
supplement receivers with sensors that try to compensate for the future changes in 
the channel as a function of past data. In the present invention we sample the data 
first and compensate later. 

Another advantage of this invention is that the correction of the threshold is 
determined using the same samplers as for sampling the actual data, not a copy of 
those samplers. This means the correction that is applied can be as exact as 
required. 

All of the compensation that is described herein is preferably implemented 
using exclusively digital circuitry, even the threshold adjustment which can be a 
charge pump. 

Empirical results from the application of the present invention have shown 
large practical improvements in the error rate versus the congregate noise and 
considerably reduced timing uncertainty. 

In some logic families, a metastable state may cause oscillation of the register. 
Metastability is considered mathematically to be an asymptotic point in time: as 
it is approached, the output of the register takes exponentially longer amounts of time 
to settle into a known state. This is true of phase noise where the outputs of the 
register are considered in aggregate over many samples. Another phenomenon can 
exist in logic families where the wire delays within the register are short in 
comparison to the gate switching speed, in which case a positive feedback state can 
exist. In this circumstance, as the metastable point is approached, the register can 
oscillate. This can be corrected by better layout, such that the registers used here 
exhibit a point of maximum phase noise at their mean transition point and do not go 
into self sustaining oscillation. 



